perm filename MCDERM[S86,JMC] blob sn#825972 filedate 1986-10-08 generic text, type C, neo UTF8
COMMENT ⊗   VALID 00003 PAGES
C REC  PAGE   DESCRIPTION
C00001 00001
C00002 00002	%mcderm[s86,jmc]		Notes on McDermott's Lament
C00015 00003	\medskip
C00016 ENDMK
C⊗;
%mcderm[s86,jmc]		Notes on McDermott's Lament
\input memo.tex[let,jmc]
\centerline{\bf Logic in AI: Notes on McDermott's Lament}
\smallskip
\centerline{by John McCarthy, Stanford University}
\medskip
	When such a prominent advocate of the use of logic in AI
as Drew McDermott becomes discouraged and publishes a recantation
like [McDermott 1986], it behooves us logic users to
pay careful attention.  Even if we are not also discouraged, we
may find sharper formulations of the problems we face than are
provided by the long term opponents of rationalism like the Dreyfuses,
Searle and Winograd and the opponents of logic within AI like
Minsky and Schank.

	I have read McDermott's opus.  He has indeed found some
difficulties, but I am not discouraged.  I never thought the
logic approach to AI, or any other approach would be easy.

	This commentary emphasizes the points that give the most
difficulties.  My optimism is mainly based on the possibilities provided
by non-monotonic reasoning.  We have barely begun to
explore the different kinds of non-monotonic reasoning that are possible
and the options they provide for expressing common sense knowledge.
Indeed I will mention some previously unpublished approaches in this
note.

	Even if the present situation were as gloomy as McDermott
says, I would still pursue the possibilities of using logic and
making suitable extensions, largely because that's what I know
how to do.  Also as long as other approaches, left undefined
by McDermott, don't solve the problems that give the logical
approach difficulty, logic represents one of the alternatives.

\medskip
\noindent{\bf Problems with non-monotonic reasoning}

	In [McCarthy 1986], I introduced the idea of a {\it simple
abnormality theory (SAT)}.  In a SAT there is a single predicate $ab$
to be circumscribed and all other predicate, function and individual
constants are to be taken as variable in achieving the minimum.
This makes the outcome of the circumscription depend entirely on
the axioms and other
sentences taken into account.  This property will be a requirement for a
database of common sense knowledge which is a major goal
of research into the epistemology of common sense.

	In previous discussions of circumscription, e.g. [McCarthy 1980]
and other sections of [McCarthy 1986], it was prescribed
what was to be circumscribed in each case.  This is fine for
studying the properties of circumscription as a form of
constrained minimization in logic, but its use in an
AI program would require that decisions about what to circumscribe
be made by the program.  It is much better to have a standard
circumscription policy so that the conclusions reached depend
only on what facts are taken into account.

	While simple abnormality theories are definite enough,
it was already stated in [McCarthy 1986] that they weren't
general enough.  Prioritized circumscription was required in
some known cases, but it wasn't obvious how to include the
priorities in the axioms.  Vladimir Lifschitz found
a simple blocks world example that shows the (first) difficulty,
and this motivated his development of pointwise circumscription
and subdomain circumscription.  Unfortunately,
there still isn't a straightforward analog of simple abnormality theories
using these new forms of circumscription.  There is apparently
no fundamental reason why such an analog can't be found.
Lifschitz's example is similar to McDermott's shooting-of-Fred
example.

	We now describe Lifschitz's blocks world example.  Suppose
that action $a1$ consists of trying to put block $A$ onto
block $B$ and action $a2$ consists of trying to put block $B$
onto block $C$.  We are assuming a blocks world in which a
block cannot be moved if there is something on top of it
but can normally be moved otherwise.  Here is a simple first order
axiomatization using abnormalities.
%
$$\displaylines{¬ab1 ⊃ a1,\leqno(1)\hfill\cr
a1 ⊃ ¬a2,\leqno(2)\hfill\cr
¬ab2 ⊃ a2.\leqno(3)\hfill\cr}$$

	Suppose we further know that initially all three blocks
are on the table.  Circumscribing $ab1$ and $ab2$ in
parallel in these axioms yields two minimal models.
We can write these models $\{¬ab1,a1,ab2,¬a2\}$ and
$\{ab1,¬a1,¬ab2,a2\}$.  The first model is what we expect.
There is no reason for $a1$ to be unsuccessful, i.e. to be false, so it succeeds
and prevents $a2$.  In the second model, $a1$ fails for no
apparent reason, and so $a2$ can succeed.  This doesn't correspond
to our expectations, so we would like to do the non-monotonic
reasoning in such a way as to eliminate this model.

	We can do it using prioritized circumscription where
we give minimizing $ab1$ higher priority than minimizing $ab2$.
The difficulty is figuring out how to justify these priorities
in a general way and either putting them in the database or
making them a consequence of a general circumscription policy.
It was suggested that we give earlier actions priority over
later actions, amd this would work out for programs of limited
scope, but sometimes we want to reason about the past from the present.
We'll look for a more general solution, perhaps accepting temporal
priorities as a temporary expedient if we can't soon find it.

	What we really want to say is that we know of no reason
why $a1$ should fail, while the success of $a1$ provides a reason
for the failure of $a2$.  That's fine, but actually doing it will
be a lot of work.  We will have to formalize what we know and
don't know, and this means reifying propositions as in [McCarthy 1979]
and reifying reasons or causes.  We'd better analyze the problem
more generally before we plunge into all this reifying.

	Notice the following.  Axiom (2), namely, $a1 ⊃ ¬a2$
can be written symmetrically as $¬a1 ∨ ¬a2$, and the remaining
two axioms taken together are symmetric in the indices 1 and 2.
Thus for every model of their conjunction in which $a1$
succeeds, there is an isomorphic model in which $a2$ succeeds.
Therefore, we won't get what we want by applying model theory
to axioms 1-3 taken as logical expressions.  We have either to
pay attention to what is on which side of the $⊃$ symbol or
change the notation to one which is explicit about causality.

	We can learn something from the fact that a similar
problem arises in assigning semantics to Prolog programs that
include negation.  In fact we can write (2) as a Prolog
statement
%
$$\displaylines{a2 \hbox{:-} not a1.\leqno(4)\hfill\cr}$$
%
It says that $a2$ is considered to succeeed only if $a1$ is known
to fail, and thus does what we want.  Indeed Reiter pointed
out at the 1984 Non-monotonic Reasoning Conference that the
blocks axioms translated readily into a Prolog program.

	Writers about the semantics
of Prolog have discussed model theoretically how to interpret
Prolog programs with negation, i.e. how to characterize which
models of the logical sentences making up the program are
actually wanted, i.e. correspond to the operational semantics.
Reasonable results are obtained only for a subset of Prolog
programs --- those that are stratified in a suitable way
so that infinite loops of negations are prevented by the
syntax.

	Our goal is different.  We can't limit ourselves to Horn
clauses, we want to use functions and equality, and we would like
to obtain a proof-theoretic as well as a model-theoretic result.
Namely, we want a formula, perhaps of second order logic, whose
models are just the ones we want.  Moreover, the process of getting
the formula must be general so that we can apply it to any
collection of facts we choose to take into account.

	There is no reason to expect this to be an easy problem
in general.  Admitting non-monotonic reasoning means discussing
ordering relations among the models of sets of axioms.  How to
express the desired orderings by adding sentences to the
original axioms isn't clear.  It isn't even clear as yet whether
it can be done short of formalizing the metamathematics, i.e.
writing down sentences about the relations among the models
and between the models and the sentences.

	The first step is to get rid of the symmetry by
modifying axiom 2.   Actually we'll modify all the axioms.
\medskip
\noindent The first version of mcderm[s86,jmc] is dated 1986 June 23.
\noindent This version \TeX ed on \jmcdate\space at \theTime.
\vfill\eject\end